Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay

Authors

  • Ionel-Alexandru Hosu
  • Traian Rebedea
Abstract

This paper introduces a novel method for learning to play the most difficult Atari 2600 games from the Arcade Learning Environment using deep reinforcement learning. The proposed method, called human checkpoint replay, uses checkpoints sampled from human gameplay as starting points for the learning process. This is meant to compensate for the difficulty that current exploration strategies, such as ε-greedy, have in finding successful control policies in games with sparse rewards. Like other deep reinforcement learning architectures, our model uses a convolutional neural network that receives only raw pixel inputs to estimate the state value function. We tested our method on Montezuma’s Revenge and Private Eye, two of the most challenging games from the Atari platform. The results we obtained show a substantial improvement compared to previous learning approaches, as well as over a random player. We also propose a method for training deep reinforcement learning agents using human gameplay experience, which we call human experience replay.
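
The idea of starting episodes from human-reached checkpoints can be illustrated with a minimal sketch. It is not the authors' implementation: a toy chain environment (the hypothetical `ChainEnv`) stands in for an Atari game, tabular Q-learning is swapped in for the convolutional network, and the names `clone_state`, `restore_state`, and `record_human_checkpoints`, the uniform sampling of checkpoints, and the checkpoint spacing are all assumptions made for illustration.

```python
import random


class ChainEnv:
    """Hypothetical sparse-reward task: reward 1.0 only on reaching the last cell."""

    def __init__(self, length=30, max_steps=40):
        self.length = length
        self.max_steps = max_steps
        self.pos = 0
        self.steps = 0

    def reset(self):
        self.pos, self.steps = 0, 0
        return self.pos

    def clone_state(self):
        # Assumed checkpoint format: just the position.
        return self.pos

    def restore_state(self, checkpoint):
        # Resume from a checkpoint with a fresh step budget.
        self.pos, self.steps = checkpoint, 0
        return self.pos

    def step(self, action):
        # Action 0 moves left, action 1 moves right.
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        self.steps += 1
        reached_goal = self.pos == self.length - 1
        done = reached_goal or self.steps >= self.max_steps
        return self.pos, (1.0 if reached_goal else 0.0), done


def epsilon_greedy(q_row, epsilon, rng):
    """Random action with probability epsilon, otherwise a greedy one (ties broken randomly)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def record_human_checkpoints(env, human_actions, every=5):
    """Replay a recorded human action sequence, saving a checkpoint every `every` steps."""
    checkpoints = []
    env.reset()
    for t, action in enumerate(human_actions):
        if t % every == 0:
            checkpoints.append(env.clone_state())
        env.step(action)
    return checkpoints


def train(env, checkpoints, episodes=500, epsilon=0.1, alpha=0.5, gamma=0.95, seed=0):
    """Tabular Q-learning where each episode starts from a sampled human checkpoint."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.length)]
    for _ in range(episodes):
        if checkpoints:
            state = env.restore_state(rng.choice(checkpoints))
        else:
            state = env.reset()  # baseline: always start from the beginning
        done = False
        while not done:
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state, reward, done = env.step(action)
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


env = ChainEnv()
human_actions = [1] * (env.length - 1)  # a stand-in "human" run that walks straight to the goal
checkpoints = record_human_checkpoints(env, human_actions)
q = train(env, checkpoints)
```

Passing an empty checkpoint list to `train` gives the plain ε-greedy baseline that always starts from the initial state, which makes it easy to compare the two start-state schemes on the same toy task.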

Similar resources

Prioritized Experience Replay

Experience replay lets online reinforcement learning agents remember and reuse experiences from the past. In prior work, experience transitions were uniformly sampled from a replay memory. However, this approach simply replays transitions at the same frequency that they were originally experienced, regardless of their significance. In this paper we develop a framework for prioritizing experienc...

Playing Atari with Deep Reinforcement Learning

We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learnin...

Playing Games with Deep Reinforcement Learning

Recently, Google DeepMind showcased how deep learning can be used in conjunction with existing reinforcement learning (RL) techniques to play Atari games [10], beat a world-class player [13] in the game of Go, and solve complicated riddles [3]. Deep learning has been shown to be successful in extracting useful, nonlinear features from high-dimensional media such as images, text, video and audio [...

Knowledge Transfer for Deep Reinforcement Learning with Hierarchical Experience Replay

The process of transferring the knowledge of multiple reinforcement learning policies into a single multi-task policy via a distillation technique is known as policy distillation. When policy distillation is performed in a deep reinforcement learning setting, the very large parameter count and the huge state space of each task domain mean that it requires extensive computational effort to train the multi-task po...

Vision-based Deep Reinforcement Learning

Recently, Google DeepMind showcased how deep learning can be used in conjunction with existing reinforcement learning (RL) techniques to play Atari games [11], beat a world-class player [14] in the game of Go, and solve complicated riddles [3]. Deep learning has been shown to be successful in extracting useful, nonlinear features from high-dimensional media such as images, text, video and audio [...

Journal title:
  • CoRR

Volume abs/1607.05077  Issue 

Pages  -

Publication date 2016